RELATED APPLICATION
This application claims priority under 35 U.S.C. § 119(e) based on U.S. Provisional
Application Serial No. 60/216,530, entitled DATA ENTRY AND SEARCH FOR HANDHELD
DEVICES, filed on July 6, 2000, the disclosure of which is herein incorporated by reference
in its entirety.

BACKGROUND OF THE INVENTION 

A. Field of the Invention 

This invention relates generally to methods and apparatus for providing search 
results in response to an ambiguous search query provided by a user.

B. Description of the Related Art 

Most search engines operate under the assumption that the end user is entering 
search queries using something like a conventional keyboard, where the input of 
alphanumeric strings is not difficult. As small devices become more common, however,
this assumption is not always valid. For example, users may query search engines using
a wireless telephone that supports the WAP (Wireless Application Protocol) standard.
Devices such as wireless telephones typically have a data input interface wherein a
particular action by the user (e.g., pressing a key) may correspond to more than one
alphanumeric character. A detailed description of WAP architecture is available at
http://www1.wapforum.org/tech/documents/SPEC-WAPArch-19980439.pdf ("WAP 100
Wireless Application Protocol Architecture Specification"). 

In the usual case, the WAP user navigates to the search query page, and is 
presented with a form into which they input their search query. With conventional 
methods, the user may be required to press multiple keys to select a particular letter. On
a standard telephone keypad, for example, the user would select the letter "b" by
pressing the "2" key twice, or would select the letter "s" by pressing the "7" key four times.
Accordingly, to enter a query for "ben smith", the user would ordinarily need to enter the
following string of keypresses: 223366077776444844, which map to letters as follows:

22 -> b
33 -> e
66 -> n
0 -> space
7777 -> s
6 -> m
444 -> i
8 -> t
44 -> h

After the user has entered their search request, the search engine receives the word or
words from the user, and proceeds in much the same manner as if it had received the
request from a desktop browser wherein the user employed a conventional keyboard.

As can be seen from the foregoing example, this form of data entry is inefficient in
that it requires eighteen keystrokes to enter the nine alphanumeric characters (including
the space) corresponding to "ben smith". Others have attempted to overcome the
limitations imposed by reduced data entry devices, but each of the approaches
developed thus far has shortcomings. There remains, therefore, a need for methods and
apparatus for providing relevant search results in response to an ambiguous search
query.

SUMMARY OF THE INVENTION

Methods and apparatus consistent with the present invention, as embodied and
broadly described herein, provide relevant search results in response to an ambiguous
search query. Consistent with the invention, such a method includes receiving a sequence
of ambiguous information components from a user. The method obtains mapping
information that maps the ambiguous information components to less ambiguous
information components. This mapping information is used to translate the sequence of
ambiguous information components into one or more corresponding sequences of less
ambiguous information components. One or more of these sequences of less ambiguous
information components are provided as an input to a search engine. The search results
are obtained from the search engine and are presented to the user.


BRIEF DESCRIPTION OF THE DRAWINGS 
The accompanying drawings, which are incorporated in, and constitute a part of, this
specification illustrate an embodiment of the invention and, together with the description,
serve to explain the advantages and principles of the invention. In the drawings,

FIG. 1 illustrates a block diagram of a system in which methods and apparatus
consistent with the present invention may be implemented;

FIG. 2 illustrates a block diagram of a client device, consistent with the invention;

FIG. 3 illustrates a diagram depicting three documents;

FIG. 4a illustrates a conventional alphanumeric index;

FIG. 4b illustrates a flow diagram for providing search results in response to a
conventional alphanumeric search query;

FIG. 5a illustrates a flow diagram, consistent with the invention, for providing search
results in response to an ambiguous search query;

FIG. 5b illustrates a diagram for mapping alphanumeric information to numeric
information; and

FIG. 6 illustrates another flow diagram, consistent with the invention, for providing
search results in response to an ambiguous search query.



DETAILED DESCRIPTION 
Reference will now be made in detail to an embodiment of the present invention as
illustrated in the accompanying drawings. The same reference numbers may be used
throughout the drawings and the following description to refer to the same or like parts.

A. Overview

Methods and apparatus consistent with the invention allow a user to submit an
ambiguous search query and to receive potentially disambiguated search results. In one
embodiment, a sequence of numbers received from a user of a standard telephone keypad
is translated into a set of potentially corresponding alphanumeric sequences. These
potentially corresponding alphanumeric sequences are provided as an input to a
conventional search engine, using a boolean "OR" expression. In this manner, the search
engine is used to help limit search results to those in which the user was likely interested.

B. Architecture 

FIG. 1 illustrates a system 100 in which methods and apparatus, consistent with
the present invention, may be implemented. The system 100 may include multiple client
devices 110 connected to multiple servers 120 and 130 via a network 140. The network
140 may include a local area network (LAN), a wide area network (WAN), a telephone
network, such as the Public Switched Telephone Network (PSTN), an intranet, the
Internet, or a combination of networks. Two client devices 110 and three servers 120
and 130 have been illustrated as connected to network 140 for simplicity. In practice,
there may be more or fewer client devices and servers. Also, in some instances, a client
device may perform the functions of a server and a server may perform the functions of a
client device.

The client devices 110 may include devices, such as mainframes, minicomputers,
personal computers, laptops, personal digital assistants, or the like, capable of
connecting to the network 140. The client devices 110 may transmit data over the
network 140 or receive data from the network 140 via a wired, wireless, or optical
connection.

FIG. 2 illustrates an exemplary client device 110 consistent with the present
invention. The client device 110 may include a bus 210, a processor 220, a main
memory 230, a read only memory (ROM) 240, a storage device 250, an input device 260,
an output device 270, and a communication interface 280.

The bus 210 may include one or more conventional buses that permit
communication among the components of the client device 110. The processor 220 may
include any type of conventional processor or microprocessor that interprets and
executes instructions. The main memory 230 may include a random access memory
(RAM) or another type of dynamic storage device that stores information and instructions
for execution by the processor 220. The ROM 240 may include a conventional ROM
device or another type of static storage device that stores static information and
instructions for use by the processor 220. The storage device 250 may include a
magnetic and/or optical recording medium and its corresponding drive.

The input device 260 may include one or more conventional mechanisms that
permit a user to input information to the client device 110, such as a keyboard, a mouse,
a pen, voice recognition and/or biometric mechanisms, etc. The output device 270 may
include one or more conventional mechanisms that output information to the user,
including a display, a printer, a speaker, etc. The communication interface 280 may
include any transceiver-like mechanism that enables the client device 110 to
communicate with other devices and/or systems. For example, the communication
interface 280 may include mechanisms for communicating with another device or system
via a network, such as network 140.

As will be described in detail below, the client devices 110, consistent with the
present invention, perform certain searching-related operations. The client devices 110
may perform these operations in response to processor 220 executing software
instructions contained in a computer-readable medium, such as memory 230. A
computer-readable medium may be defined as one or more memory devices and/or
carrier waves. The software instructions may be read into memory 230 from another
computer-readable medium, such as the data storage device 250, or from another device
via the communication interface 280. The software instructions contained in memory
230 cause processor 220 to perform the search-related activities described below.
Alternatively, hardwired circuitry may be used in place of or in combination with software
instructions to implement processes consistent with the present invention. Thus, the
present invention is not limited to any specific combination of hardware circuitry and
software.

The servers 120 and 130 may include one or more types of computer systems,
such as a mainframe, minicomputer, or personal computer, capable of connecting to the
network 140 to enable servers 120 and 130 to communicate with the client devices 110.
In alternative implementations, the servers 120 and 130 may include mechanisms for
directly connecting to one or more client devices 110. The servers 120 and 130 may
transmit data over network 140 or receive data from the network 140 via a wired,
wireless, or optical connection.

The servers may be configured in a manner similar to that described above in
reference to FIG. 2 for client device 110. In an implementation consistent with the
present invention, the server 120 may include a search engine 125 usable by the client
devices 110. The servers 130 may store documents (or web pages) accessible by the
client devices 110.

C. Architectural Operation 

FIG. 3 illustrates a diagram depicting three documents, which may be stored, for
example, on one of the servers 130.

A first document (Document 1) contains two entries, "car repair" and "car rental",
and is numbered "3" at its bottom. A second document (Document 2) contains the entry
"video rental". A third document (Document 3) contains three entries, "wine", "champagne",
and "bar items", and includes a link (or reference) to Document 2.

For the sake of illustrative simplicity, the documents shown in FIG. 3 only contain
alphanumeric strings of information (e.g., "car", "repair", "wine", etc.). Those skilled in the art
will recognize, however, that in other situations the documents could contain other types of
information, such as phonetic or audiovisual information.

FIG. 4a illustrates a conventional alphanumeric index, based on the documents 
shown in FIG. 3. The first column of the index contains a list of alphanumeric terms, and the 
second column contains a list of the documents corresponding to those terms. Some terms,
such as the alphanumeric term "3", only correspond to (e.g., appear in) one document, in
this case Document 1. Other terms, such as "rental", correspond to multiple documents, in
this case Documents 1 and 2. 

FIG. 4b illustrates how a conventional search engine, such as search engine 125, 
would use the index illustrated in FIG. 4a to provide search results in response to an
alphanumeric search query. The alphanumeric query may be generated using any
conventional technique. For purposes of illustration, FIG. 4b depicts two alphanumeric
queries: "car" and "wine". Under a conventional approach, search engine 125 receives an
alphanumeric query, such as "car" (stage 410), and uses the alphanumeric index to
determine which documents correspond to that query (stage 420). In this example, a
conventional search engine 125 would use the index illustrated in FIG. 4a to determine that
"car" corresponds to Document 1 and would return Document 1 (or a reference to it) to the 
user as a search result. Similarly, a conventional search engine would determine that "wine" 
corresponds to Document 3 and would return Document 3 (or a reference to it) to the user 
(stage 430). 
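The conventional index and lookup described above can be sketched as follows. This is a minimal illustration only, assuming that each entry of the documents in FIG. 3 is split on whitespace into terms; the names DOCUMENTS, build_index, and lookup are hypothetical and do not appear in the figures.

```python
from collections import defaultdict

# Entries of the three documents of FIG. 3, as described in the text.
DOCUMENTS = {
    1: ["car repair", "car rental", "3"],
    2: ["video rental"],
    3: ["wine", "champagne", "bar items"],
}

def build_index(documents):
    """Map each alphanumeric term to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, entries in documents.items():
        for entry in entries:
            for term in entry.lower().split():
                index[term].add(doc_id)
    return dict(index)

def lookup(index, query):
    """Return the sorted document numbers corresponding to a one-term query."""
    return sorted(index.get(query.lower(), set()))

INDEX = build_index(DOCUMENTS)
print(lookup(INDEX, "car"))     # [1]
print(lookup(INDEX, "wine"))    # [3]
print(lookup(INDEX, "rental"))  # [1, 2]
```

A term that is absent from the index simply yields an empty result, a property the ambiguous-query technique described below relies on.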

FIG. 5a illustrates a flow diagram, consistent with the invention, of a preferred
technique for providing search results in response to a numeric search query, based on the
documents and index shown in FIGS. 3 and 4a, respectively. For the sake of illustrative
ease, FIG. 5a describes a particular technique for processing a numeric query based on the
mapping of a standard telephone handset; but those skilled in the art will recognize that
other techniques consistent with the invention may be used.

At stage 510, a sequence "227" (consisting of numeric components "2", "2", and "7") 
is received from a user. At stage 520, information is obtained about how the numeric
components map to letters. Assuming that the user entered the information from a standard
telephone keypad, this mapping information is shown in FIG. 5b. As shown in FIG. 5b, the
letters "a", "b", and "c" each map to the number "2", the letters "p", "q", "r", and "s" each map
to the number "7", and so forth. 
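One possible representation of this mapping information is sketched below, assuming the standard telephone keypad layout for keys "2" through "9". The names KEYPAD, LETTER_TO_DIGIT, and to_digits are illustrative and are not taken from FIG. 5b.

```python
# Letters assigned to each key of a standard telephone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Inverse view: each letter maps to exactly one number.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}

def to_digits(word):
    """Return the number sequence a user would key in for `word`."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())

print(to_digits("bar"))  # 227
print(to_digits("car"))  # 227
```

Because "bar" and "car" yield the same number sequence, the received sequence "227" is ambiguous between them.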

At stage 530, using this mapping information, the sequence "227" is translated into
its potential alphanumeric equivalents. Based on the information shown in FIG. 5b, there
exist 36 possible combinations of letters that correspond to the sequence "227", including
the following: aap, bap, cap, abp, bbp, ... bar ... car ... ccs. If numbers are included in
the possible combinations (e.g., "aa7"), there would exist 80 possible combinations. Rather
than generating all possible alphanumeric equivalents, it may be desirable to limit the
generated equivalents based on some lexicon. For example, it may be desirable to
generate only those alphanumeric equivalents that appear in a dictionary, search engine log
of previous search queries, etc.; or to otherwise limit the alphanumeric equivalents by using
known statistical techniques (e.g., the probability of certain words appearing together). 
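The translation at stage 530 can be sketched as a Cartesian product over the letters of each key. The sketch below assumes the standard keypad layout; letter_words and the small LEXICON set are hypothetical stand-ins for whatever dictionary or query log an implementation might actually use.

```python
from itertools import product

# Letters assigned to each key of a standard telephone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def letter_words(number_word, include_digits=False):
    """List every alphanumeric equivalent of a number sequence."""
    choices = [KEYPAD[d] + (d if include_digits else "") for d in number_word]
    return ["".join(combo) for combo in product(*choices)]

candidates = letter_words("227")
print(len(candidates))                                # 36 (3 x 3 x 4)
print(len(letter_words("227", include_digits=True)))  # 80 (4 x 4 x 5)

# Hypothetical lexicon used to limit the generated equivalents.
LEXICON = {"bar", "cap", "car"}
print([word for word in candidates if word in LEXICON])  # ['bar', 'cap', 'car']
```

The counts follow directly from the keypad: keys "2" and "7" carry three and four letters respectively, and each gains one more choice when the digit itself is allowed.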

At stage 540, these alphanumeric equivalents are provided as an input to a
conventional search engine, such as that described in reference to FIGS. 4a and 4b, using a
logical "OR" operation. For example, the search query provided to the search engine could
be "aap OR bap OR cap OR abp . . . OR bar . . . OR car". Although all possible
alphanumeric equivalents may be provided to the search engine, a subset may instead be
used by using conventional techniques to eliminate equivalents that are unlikely to be
intended. For example, one could generate a narrower list of possible combinations by
using techniques that draw upon probabilistic information about the usage of letters or
words: one could ignore combinations that begin with "qt" but include (and favor)
combinations that begin with "qu." 
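A minimal sketch of stage 540 follows, assuming the search engine accepts terms joined by the literal keyword "OR". The UNLIKELY_PREFIXES tuple is a hypothetical, deliberately tiny stand-in for real probabilistic information about letter usage, and the KEYPAD excerpt lists only the keys used in the example.

```python
from itertools import product

# Only the keys needed for this example.
KEYPAD = {"2": "abc", "7": "pqrs", "8": "tuv"}

# Hypothetical list of letter pairs judged unlikely to begin a word.
UNLIKELY_PREFIXES = ("qt",)

def letter_words(number_word):
    """List every letter sequence corresponding to a number sequence."""
    return ["".join(c) for c in product(*(KEYPAD[d] for d in number_word))]

def or_query(terms):
    """Join candidate terms into a single boolean OR expression."""
    return " OR ".join(terms)

# The sequence "78" includes both "qt" and "qu"; only "qt" is dropped.
words = [w for w in letter_words("78") if not w.startswith(UNLIKELY_PREFIXES)]
print(or_query(words))
# pt OR pu OR pv OR qu OR qv OR rt OR ru OR rv OR st OR su OR sv
```

A fuller implementation might also reorder the surviving terms by likelihood rather than merely filtering them.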

At stage 550, search results are obtained from the search engine. Because terms
such as "aap" and "abp" do not appear in the search engine's index, they are effectively
ignored. Indeed, the only generated terms contained within the index shown in FIG. 4a are
"car" and "bar", and so the only search results returned are those that reference Documents
1 and 3. At stage 560, these search results are presented to the user. The search results
may be presented in the same order provided by the search engine, or may be reordered
based on considerations such as the language of the user. Assuming that the user was only
interested in documents containing the term "car", the user would receive an undesired
result (Document 3) in addition to the desired result (Document 1). This may be an
acceptable price to pay, however, for the benefit of the user only having to press three keys
to formulate the search query. 
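Stages 550 and 560 can be illustrated by evaluating the OR expression against the index. In the sketch below, the INDEX contents are reconstructed from the description of FIGS. 3 and 4a rather than copied from the figure, and search_any is a hypothetical helper.

```python
from itertools import product

# Only the keys needed for the sequence "227".
KEYPAD = {"2": "abc", "7": "pqrs"}

# Term -> documents, reconstructed from the description of FIGS. 3 and 4a.
INDEX = {
    "car": {1}, "repair": {1}, "rental": {1, 2}, "3": {1},
    "video": {2}, "wine": {3}, "champagne": {3}, "bar": {3}, "items": {3},
}

def search_any(index, terms):
    """Evaluate 'term1 OR term2 OR ...' and record which terms matched where."""
    hits = {}
    for term in terms:
        for doc_id in index.get(term, ()):  # unindexed terms contribute nothing
            hits.setdefault(doc_id, set()).add(term)
    return hits

terms = ["".join(c) for c in product(*(KEYPAD[d] for d in "227"))]
results = search_any(INDEX, terms)
for doc_id in sorted(results):
    print(doc_id, sorted(results[doc_id]))
# 1 ['car']
# 3 ['bar']
```

Of the 36 generated terms, 34 are absent from the index and drop out without any special handling.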

FIG. 6 illustrates another flow diagram, consistent with the invention, of a preferred
technique for providing search results in response to a numeric search query, based on the
documents and index shown in FIGS. 3 and 4a, respectively. This flow diagram
demonstrates how increasing the size of the received sequence can help limit search results
to those desired by the user. For the sake of illustrative ease, FIG. 6 again describes a
particular technique for processing a numeric query based on the mapping of a standard
telephone handset; but those skilled in the art will recognize that other techniques consistent
with the invention may be used.

At stage 610, a sequence "227 48367" (consisting of numeric components "2", "2",
"7", "4", "8", "3", "6", "7") is received from a user. For the sake of explanation, the sequence
"227" will be called a "number word" and the entire sequence "227 48367" will be called a
"number phrase." The possible alphanumeric equivalents of a number word will be called
"letter words" and the possible alphanumeric equivalents of a number phrase will be called
"letter phrases."

At stage 620, information is obtained about how the numeric components map to
letters. Assuming the same mapping information is used as shown in FIG. 5b, at stage 630,
the number phrase "227 48367" is translated into potentially corresponding letter phrases.
Based on the information shown in FIG. 5b, there exist 11664 possible letter phrases that
correspond to the sequence "227 48367".
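The figure of 11664 can be checked by multiplying the number of letters on each key pressed. The sketch below assumes the standard keypad layout; count_letter_phrases is a hypothetical helper name.

```python
from math import prod

# Letters assigned to each key of a standard telephone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def count_letter_phrases(number_phrase):
    """Count the letter phrases for a number phrase (spaces separate words)."""
    return prod(len(KEYPAD[d]) for d in number_phrase if d != " ")

print(count_letter_phrases("227"))        # 36    = 3 * 3 * 4
print(count_letter_phrases("48367"))      # 324   = 3 * 3 * 3 * 3 * 4
print(count_letter_phrases("227 48367"))  # 11664 = 36 * 324
```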

At stage 640, these letter phrases are provided as an input to a conventional search
engine, such as that described in reference to FIGS. 4a and 4b, using a logical "OR"
operation. For example, the search query provided to the search engine could be "'aap
gtdmp' OR 'aap htdmp' ... OR 'bar items' ... OR 'car items'". Although all possible letter
phrases may be provided to the search engine, a subset may instead be used by employing
conventional techniques to eliminate letter phrases that are unlikely to be intended.

At stage 650, search results are obtained from the search engine. Because many
search engines are designed to rank highly those documents that contain the exact phrase
sought, Document 3 would likely be the highest ranked search result (i.e., because it
contains the exact phrase "bar items"). No other document in the example contains one of
the other letter phrases generated at stage 630. Furthermore, many search engines
downweight (or eliminate) search results that contain individual parts of a phrase but not the
entire phrase. For example, Document 1 would be downweighted or eliminated because it
contains the letter word "car", which corresponds to the first part of the letter phrase, but it
does not contain any letter word that corresponds to the second part of the letter phrase.
Finally, letter phrases such as "aap htdmp" are effectively ignored because they contain no
letter words that appear in the search engine's index. 
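The effect of phrase matching can be sketched as follows. Rather than issuing all 11664 letter phrases, this illustration checks each document entry word by word against the letter words of the corresponding number word, which produces the same classification for this example. The function classify and its exact/partial split are assumptions made for illustration and do not describe how any particular search engine ranks results.

```python
from itertools import product

# Letters assigned to each key of a standard telephone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Entries of the three documents of FIG. 3, as described in the text.
DOCUMENTS = {
    1: ["car repair", "car rental", "3"],
    2: ["video rental"],
    3: ["wine", "champagne", "bar items"],
}

def letter_words(number_word):
    """Set of letter words corresponding to one number word."""
    return {"".join(c) for c in product(*(KEYPAD[d] for d in number_word))}

def classify(number_phrase, documents):
    """Split documents into exact-phrase matches and partial matches."""
    word_sets = [letter_words(w) for w in number_phrase.split()]
    exact, partial = [], []
    for doc_id, entries in documents.items():
        phrases = [entry.lower().split() for entry in entries]
        if any(
            len(p) == len(word_sets) and all(w in s for w, s in zip(p, word_sets))
            for p in phrases
        ):
            exact.append(doc_id)      # contains an entire letter phrase
        elif any(w in s for p in phrases for w in p for s in word_sets):
            partial.append(doc_id)    # contains only part of a letter phrase
    return exact, partial

exact, partial = classify("227 48367", DOCUMENTS)
print(exact)    # [3]  ("bar items")
print(partial)  # [1]  ("car" only)
```

An engine could list the exact matches first and either downweight or drop the partial ones, as described above.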

At stage 660, the search results are presented to the user. In the example shown, 
the first result shown to the user would be Document 3, which is likely most relevant to the
user's query. Document 1 may be eliminated altogether, because it does not contain one of
the possible letter phrases. In this manner, the user is provided with the most relevant
search results.

Although the descriptions above in reference to FIGS. 5 and 6 are made in reference
to receiving numeric information and mapping it to alphanumeric information, those skilled in
the art will recognize that other implementations are possible consistent with the invention.
For example, instead of receiving a sequence of numbers corresponding to the keys
pressed by a user, the received sequence may consist of the first letters corresponding to
the keys pressed by the user. In other words, instead of receiving "227", the received
sequence may be "aap". Consistent with the invention, the equivalent letter sequences
generated in stages 530 or 630 could then be other letter sequences (e.g., "bar") that
correspond to "aap." Indeed, the received sequence may contain phonetic, audiovisual, or
any other type of information components.
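One way to handle a first-letter sequence such as "aap" is to normalize it to the key sequence it stands for and then expand it as before. The sketch below assumes the standard keypad layout; normalize and letter_words are hypothetical helper names, and other implementations are equally possible.

```python
from itertools import product

# Letters assigned to each key of a standard telephone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}

def normalize(sequence):
    """Convert a received sequence of digits or letters into a number sequence."""
    return "".join(
        ch if ch.isdigit() or ch == " " else LETTER_TO_DIGIT[ch]
        for ch in sequence.lower()
    )

def letter_words(number_word):
    """Set of letter sequences corresponding to one number sequence."""
    return {"".join(c) for c in product(*(KEYPAD[d] for d in number_word))}

print(normalize("aap"))                         # 227
print("bar" in letter_words(normalize("aap")))  # True
```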

Regardless of the form in which the sequence is received, it is generally preferred 
that the received sequence be translated into a sequence that corresponds to the format in 
which information is stored in the search engine's index. For example, if the search engine's 
index is stored in alphanumeric format, the received sequence should be translated into 
alphanumeric sequences.

Furthermore, it is generally preferred that the mapping technique that is used to 
translate the received sequence of information components be the same technique that is 
employed at the user's device to map the user's input to the information generated by the 
device. There may, however, be instances where it is preferable to use a different mapping 
technique than is used for user input.

D. Conclusion

As described in detail above, methods and apparatus consistent with the invention 
provide search results in response to an ambiguous search query. The foregoing 
description of an implementation of the invention has been presented for purposes of 
illustration and description. Modifications and variations are possible in light of the above
teachings or may be acquired from practicing the invention. 

For example, although the foregoing description is based on a client-server
architecture, those skilled in the art will recognize that a peer-to-peer architecture may be
used consistent with the invention. Moreover, although the described implementation
includes software, the invention may be implemented as a combination of hardware and
software or in hardware alone. Additionally, although aspects of the present invention are
described as being stored in memory, one skilled in the art will appreciate that these aspects
can also be stored on other types of computer-readable media, such as secondary storage
devices, like hard disks, floppy disks, or CD-ROM; a carrier wave from the Internet; or other
forms of RAM or ROM. The scope of the invention is therefore defined by the claims and
their equivalents.